Automated detection scheme for acute myocardial infarction using convolutional neural network and long short-term memory

The early detection of acute myocardial infarction, which is caused by lifestyle-related risk factors, is essential because it can lead to chronic heart failure or sudden death. Echocardiography, among the most common methods used to detect acute myocardial infarction, is a noninvasive modality for the early diagnosis and assessment of abnormal wall motion. However, depending on disease range and severity, abnormal wall motion may be difficult to distinguish from normal myocardium. As abnormal wall motion can lead to fatal complications, high accuracy is required in its detection over time on echocardiography. This study aimed to develop an automatic detection method for acute myocardial infarction using convolutional neural networks (CNNs) and long short-term memory (LSTM) in echocardiography. The short-axis view (papillary muscle level) of one cardiac cycle and left ventricular long-axis view were input into VGG16, a CNN model, for feature extraction. Thereafter, LSTM was used to classify the cases as normal myocardium or acute myocardial infarction. The overall classification accuracy reached 85.1% for the left ventricular long-axis view and 83.2% for the short-axis view (papillary muscle level). These results suggest the usefulness of the proposed method for the detection of myocardial infarction using echocardiography.


Introduction
Acute myocardial infarction (AMI) is a disease in which myocardial cells become necrotic due to thrombus formation or blood vessel occlusion. AMI causes severe chest pain and requires immediate treatment, such as percutaneous transluminal coronary recanalization or coronary artery bypass grafting. It is important to diagnose AMI as early as possible because it can lead to heart failure, arrhythmia, or sudden death.
Echocardiography, a noninvasive imaging modality used to diagnose AMI, enables the real-time assessment of cardiac function and complications and the evaluation of regional abnormal wall motion in patients with AMI. Therefore, it is widely used in cardiology. However, depending on disease range and severity, regional abnormal wall motion can be difficult to recognize. Moreover, the accuracy of its recognition depends on sonographer experience.
[...] hospital. A total of 202 cines were collected as inputs: 99 cases diagnosed as acute anteroseptal infarction of the proximal left anterior descending artery according to the American Heart Association Committee Report and 103 normal cases. Cardiologists and experienced sonographers usually estimate the culprit coronary artery in patients with myocardial infarction using left ventricular long-axis, left ventricular short-axis, apical four-chamber, apical two-chamber, and apical long-axis views. Here we employed the short-axis papillary muscle (PM) level and left ventricular long-axis (LX) views because abnormal wall motion in anteroseptal infarction of the proximal left anterior descending artery is relatively easy to evaluate in these views. Moreover, all cases of anteroseptal infarction used in this study underwent coronary angiography, which confirmed occlusion of #6. In addition, patients who underwent percutaneous coronary intervention after coronary angiography and had abnormal wall motion on echocardiography were included. Fig 2 shows the views used.
Table 1 [...] population, respectively. P values indicate differences between patients with normal myocardium and those with AMI. P < 0.05 was considered statistically significant.
The image preprocessing involved electrocardiogram (ECG) removal, cropping, and frame interpolation of the views.
On echocardiography, a two-lead ECG is displayed to identify time points in the cardiac cycle such as end-diastole and end-systole. Fig 4 shows the preprocessing of the input images. In this study, to recognize the cardiac wall motion on each view, the ECG trace was removed and the image was cropped to a bounding rectangle. To account for differences in heart rate between patients, one cardiac cycle was extracted from each cine and the number of frames was interpolated to 30. That is, if one cardiac cycle contained 50-60 frames, it was resampled at equal intervals down to 30 frames, whereas if it contained 10-20 frames, it was interpolated up to 30 frames, so that every patient had 30 echocardiography images. Linear interpolation was used as the interpolation method, and the images for one cardiac cycle were extracted based on the interval between the peaks (R-R interval) of the simultaneously recorded ECG.
This study was approved by an institutional review board of Fujita Health University, and informed consent was obtained from patients subject to the condition of data anonymization (No. HM19-345).
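The resampling of one cardiac cycle to 30 frames by linear interpolation can be illustrated with a short NumPy sketch. This is a minimal illustration, not the authors' implementation: the function name `resample_cycle`, the (frames, height, width) array layout, and the choice to keep the first and last frames as endpoints are all assumptions, and the cine is assumed to have already been trimmed to one R-R interval.

```python
import numpy as np

def resample_cycle(frames: np.ndarray, n_out: int = 30) -> np.ndarray:
    """Linearly interpolate a (T, H, W) cine of one cardiac cycle to n_out frames."""
    frames = np.asarray(frames, dtype=np.float64)
    n_in = frames.shape[0]
    if n_in < 2:
        raise ValueError("need at least two frames to interpolate")
    # Equally spaced sample positions on the original frame axis (endpoints kept).
    pos = np.linspace(0.0, n_in - 1, n_out)
    lo = np.floor(pos).astype(int)        # frame just before each position
    hi = np.minimum(lo + 1, n_in - 1)     # frame just after (clamped at the end)
    # Per-position blend weight, broadcast over the image dimensions.
    w = (pos - lo).reshape(-1, *([1] * (frames.ndim - 1)))
    return (1.0 - w) * frames[lo] + w * frames[hi]
```

The same function covers both cases described in the text: a 50-60 frame cycle is sampled down and a 10-20 frame cycle is filled in, and either way the result has 30 frames.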

Feature extraction
The features of the interpolated images were extracted for input into the classification model [23]. CNNs can output features as their final outputs, or parameters can be extracted as intermediate outputs from individual layers. Varshni et al. used CNNs to extract features from chest radiographs [24]. Hyeon et al. used CNNs to extract features from cytology images and used conventional machine learning methods to differentiate between benign and malignant cells [24,25]. By using CNNs as feature extractors and other models as output layers, new inputs can be added and the accuracy can be further improved compared with the use of a CNN alone. Therefore, we focused on this approach and adopted VGG16, a CNN model, and global pooling [26] as feature extractors. Using these feature extraction methods, we extracted features from all frames of the interpolated echocardiography images.
VGG16 model. This study used VGG16, a CNN model developed by the Visual Geometry Group at the University of Oxford in 2014, for the feature extraction. The structure of VGG16 is shown in Fig 5. It consists of 13 convolutional layers, 5 pooling layers, and 3 fully connected layers [27]. We used a VGG16 network pretrained on ImageNet, a large dataset of natural images. From the second fully connected layer of VGG16, 4096 features were extracted and input into the classification model. This process significantly reduces the number of dimensions from the original feature map parameters and helps to prevent overfitting. In this study, we also performed global pooling of the 7 × 7 × 512 feature maps extracted by VGG16 and output the average value (global average pooling, GAP) or the maximum value (global max pooling, GMP) of each feature map, which resulted in a total of 512 parameters.
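The global pooling step reduces each 7 × 7 feature map to a single number. The sketch below shows this reduction on a random array standing in for the VGG16 feature maps; the function name `global_pool`, the channels-last layout, and the random input are assumptions made for illustration only.

```python
import numpy as np

def global_pool(feature_maps: np.ndarray, mode: str = "avg") -> np.ndarray:
    """Reduce an (H, W, C) stack of feature maps to a length-C vector."""
    if feature_maps.ndim != 3:
        raise ValueError("expected an (H, W, C) array")
    if mode == "avg":                       # global average pooling
        return feature_maps.mean(axis=(0, 1))
    if mode == "max":                       # global max pooling
        return feature_maps.max(axis=(0, 1))
    raise ValueError("mode must be 'avg' or 'max'")

rng = np.random.default_rng(0)
maps = rng.random((7, 7, 512))   # stand-in for the 7 x 7 x 512 feature maps
gap = global_pool(maps, "avg")
gmp = global_pool(maps, "max")
print(maps.size, gap.shape, gmp.shape)   # 25088 (512,) (512,)
```

Each frame is thus described by 512 values instead of the 7 × 7 × 512 = 25088 values in the unpooled feature maps.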

LSTM networks
Because the detection of AMI on echocardiography requires the evaluation and analysis of wall motion over time, two-dimensional images from different time phases were input into the CNN. We also focused on recurrent neural networks (RNNs), which are well suited to processing sequential data and are effective for time-series information such as cine images and wave signals as well as text and natural language. This type of model is characterized by its ability to handle sequential information. Fig 7 demonstrates the principle of the RNN, where x, y, and h are the input, the output, and the state of the hidden layer, respectively.
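The recurrence illustrated in Fig 7 can be sketched in NumPy as follows. This is a minimal sketch, assuming the common Elman-style form in which U maps the input to the hidden layer, W carries the previous hidden state forward, and V maps the hidden state to the output, with a sigmoid hidden activation and a softmax output. The function name `rnn_forward` and the array shapes are illustrative and not taken from the study.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    e = np.exp(z - np.max(z))   # subtract the max for numerical stability
    return e / e.sum()

def rnn_forward(xs, U, W, V):
    """Run a simple recurrent layer over a sequence xs of shape (T, n_in).

    Assumed form: h(t) = sigmoid(U x(t) + W h(t-1)), y(t) = softmax(V h(t)).
    """
    h = np.zeros(W.shape[0])            # initial hidden state
    outputs = []
    for x in xs:                        # one step per time point
        h = sigmoid(U @ x + W @ h)      # hidden state depends on the previous one
        outputs.append(softmax(V @ h))  # class probabilities at this step
    return np.array(outputs), h
```

Because the hidden state at each step is computed from the previous one, information from earlier frames can influence the output at later frames.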
The RNN connects the layer at time t with the layer at the previous time t−1 and calculates the hidden-layer state h(t) and the output from the current input and h(t−1). U, W, and V denote the weights learned during training, and f(z) and g(z_m) denote the sigmoid and softmax activation functions, respectively. However, because RNNs in principle carry all past data through training, gradients can vanish or diverge as they are propagated back through many time steps (the vanishing gradient problem). Therefore, we focused on LSTM, an improved RNN model. The principle of the LSTM is shown in Fig 8. The difference between LSTM and RNN is that LSTM features information-selection mechanisms called the "gate" and the "cell." There are 3 types of gates: "input," "output," and "forgetting." Eq (i) gives the forgetting gate (f_t): from the input information, the output of the LSTM layer at t−1, and the cell, the information that is unnecessary for learning at t is selected and "forgotten." Eq (ii) gives the input gate (I_t): the output of the last LSTM layer and the value of the cell are used to determine the new value to be updated.

PLOS ONE
Deep learning for acute myocardial infarction
The value of the updated cell (C_t) is then determined by Eq (iii). This value is propagated to the next LSTM layer, and the output (O_t) of the LSTM layer at t is determined by Eqs (iv) and (v) in the output gate section. Using this gating mechanism, LSTM can analyze long time series and mitigates the vanishing gradient problem of conventional RNNs. We introduced these mechanisms to analyze the wall motion over time using echocardiography. Finally, as shown in Fig 9, the features extracted by the VGG16 method were input into the LSTM to classify the normal and AMI cases. For the hyperparameters, we set the learning rate to 1 × 10^−5, the number of epochs to 50, the batch size to 30, and the input data size to 4096 × 30.
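For reference, the gate relations cited above as Eqs (i) to (v) are commonly written as below. This is the standard LSTM formulation, given here as an assumption; the authors' exact notation may differ, and in particular the text describes the gates as also using the cell value, which this basic form omits. Here x_t is the input at time t, h_t the output of the LSTM layer, σ the sigmoid function, ⊙ element-wise multiplication, and the W and b terms are learned weights and biases.

```latex
\begin{align}
f_t &= \sigma\!\left(W_f\,[h_{t-1}, x_t] + b_f\right) && \text{(i) forgetting gate}\\
I_t &= \sigma\!\left(W_I\,[h_{t-1}, x_t] + b_I\right) && \text{(ii) input gate}\\
C_t &= f_t \odot C_{t-1} + I_t \odot \tanh\!\left(W_C\,[h_{t-1}, x_t] + b_C\right) && \text{(iii) cell update}\\
O_t &= \sigma\!\left(W_O\,[h_{t-1}, x_t] + b_O\right) && \text{(iv) output gate}\\
h_t &= O_t \odot \tanh(C_t) && \text{(v) layer output}
\end{align}
```

In this form the cell C_t is updated additively, scaled by the forgetting gate. This additive path is what allows gradients to survive over many time steps.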

Evaluation
Cross-validation method. The cross-validation method was used to evaluate the classification accuracy of the constructed model. Fig 10 shows a simplified diagram of the cross-validation method. The dataset was first divided into several groups, one of which was used as the test group for the evaluation, while the remaining data were used for training. The process was then repeated so that every group served once as the test data, and the accuracy was calculated over all cases. In this study, we used a five-fold cross-validation method in which the 202 echocardiography cines were divided, in each fold, into 142 cases for training, 20 cases for evaluation during training, and the remaining 40 cases for testing, with random sampling so that all cases were used as test data.
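The five-fold split described above can be sketched with the standard library alone. This is an illustrative sketch under stated assumptions: the function name `five_fold_splits` and the seed are hypothetical, the split is not stratified by class, and because 202 is not divisible by 5, two folds here hold 41 test cases rather than 40.

```python
import random

def five_fold_splits(n_cases=202, n_folds=5, n_val=20, seed=0):
    """Yield (train, val, test) index lists; every case is tested exactly once."""
    idx = list(range(n_cases))
    random.Random(seed).shuffle(idx)          # random sampling of cases
    # Fold sizes differ by at most one (202 cases -> 41, 41, 40, 40, 40).
    base, extra = divmod(n_cases, n_folds)
    start = 0
    for k in range(n_folds):
        size = base + (1 if k < extra else 0)
        test = idx[start:start + size]        # held-out fold
        rest = idx[:start] + idx[start + size:]
        val, train = rest[:n_val], rest[n_val:]   # 20 cases monitor training
        start += size
        yield train, val, test

for train, val, test in five_fold_splits():
    print(len(train), len(val), len(test))
```

Concatenating the five test lists recovers all 202 case indices, so the pooled test predictions cover the whole dataset.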
Comparison with conventional artificial neural network. To demonstrate the effectiveness of our method, the classification accuracy was also evaluated using five-fold cross-validation on a conventional artificial neural network (ANN), which, unlike LSTM, has no mechanism for handling time-series relations [28,29]. A schematic of the ANN is shown in Fig 11, in which all features extracted by the VGG16 method for each frame were combined and used as input to the neural network to classify the normal and AMI cases. For the hyperparameters, we set the learning rate to 1 × 10^−5, the number of epochs to 50, the batch size to 30, and the input data size to 4096 × 30.
Tables 2 and 3 show the confusion matrices and overall classification accuracies of LSTM and ANN for the LX images, while Tables 4 and 5 show the results for the PM images. Table 6 shows the classification accuracy for the LX and PM images for the given parameters and classifiers. Tables 7 and 8 show the sensitivities, specificities, and areas under the curve (AUC) for
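Sensitivity, specificity, and overall accuracy follow directly from the four counts of a two-class confusion matrix. The sketch below uses their usual definitions, with AMI assumed to be the positive class; the function name `binary_metrics` and the example counts are invented for illustration and are not results from this study.

```python
def binary_metrics(tp, fn, fp, tn):
    """Sensitivity, specificity and accuracy from confusion-matrix counts.

    tp/fn: AMI cases classified as AMI / as normal.
    fp/tn: normal cases classified as AMI / as normal.
    """
    sensitivity = tp / (tp + fn)                  # fraction of AMI cases detected
    specificity = tn / (tn + fp)                  # fraction of normal cases recognized
    accuracy = (tp + tn) / (tp + fn + fp + tn)    # fraction of all cases correct
    return sensitivity, specificity, accuracy

# Made-up counts, for illustration only.
sens, spec, acc = binary_metrics(tp=8, fn=2, fp=1, tn=9)
print(sens, spec, acc)
```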

Discussion
In this study, we proposed an automated classification scheme for AMI and normal cases on echocardiography images using deep learning. The VGG16 method was used to extract features from the echocardiography images, while LSTM was used for the classification. The comparison of the classification models (Tables 2 and 3) shows that the results obtained with LSTM were better than those obtained using the ANN. The overall classification accuracy using LSTM was 0.852 for the LX images and 0.832 for the PM images. These results suggest that LSTM can classify AMI and normal cases with higher accuracy than ANN and can exploit features that evolve over time. In addition, the classification accuracy of LSTM suggests that the image information of one cardiac cycle (consisting of 30 frames) is useful for analyzing myocardial motion and thereby distinguishing AMI from normal myocardium. Unlike ordinary ANNs, LSTMs have a mechanism known as "gates," which makes them well suited to the analysis of time-series information. The results showed that, by analyzing time-series information, the LSTM can detect AMI on echocardiography with better accuracy than the ANN, supporting the superiority of the proposed LSTM method. Further, an AI-based study on laryngitis pathology classification showed that the classification accuracy of LSTM was 15% higher than that of regular ANNs [30]. Other studies on AI-based solar radiation prediction have also shown that LSTM has superior accuracy [31]. In this study as well, the classification accuracy was improved using LSTM through more effective use of time-series data, which again supports the effectiveness of this method.
Table 4 shows that the overall classification accuracy of LSTM was best on the LX and PM images when GAP was used. The classification accuracy on the LX images was 0.896, while that on the PM images was 0.867. These results suggest that GAP extracts more useful features for classification than GMP.
In addition, the comparison between the 4096 features extracted from the fully connected layer and the 512 features extracted by GAP showed that the classification accuracy of LSTM increased when GAP was used. This finding suggests that GAP discards features that are unnecessary for classification and achieves more efficient learning by reducing the number of parameters. The reason for the lack of change in classification accuracy between GAP and GMP when LSTM was used on the LX images may be that similar parameters were extracted from the feature maps by GAP and GMP during the pooling process.
Visual comparison of the incorrectly and correctly classified cases showed that those with low video contrast, high noise, or high brightness of the myocardium were misclassified. These results suggest that image quality, such as noise and contrast in the video image, is among the most important factors in the classification of AMI and normal cases on echocardiography. In addition, misclassified cases tended to be inadequately depicted in the image: the left ventricle was blurred and a different short-axis level view was shown. In future work, accuracy should be improved by analyzing echocardiography images and patient data from more facilities to create a robust network model.
Similar studies are listed in Table 9 for comparison with our study. Since there are very few studies with the same images and objectives, a simple comparison with this study may be difficult. However, our method was able to classify AMI with an accuracy of more than 80% using 202 cases, supporting its validity.
We then calculated the accuracy of the classification using left ventricular LX and short-axis PM level views, which were subsequently used to detect anteroseptal infarction. Acute anteroseptal infarction was evaluated by cardiologists and experienced sonographers using left ventricular LX and short-axis PM level views as well as apical four-chamber and apical LX views, which allow observation of the apex. However, inexperienced clinicians, non-cardiologists, residents, and others unfamiliar with echocardiography may find it difficult to obtain apical four-chamber and apical LX view images of adequate quality. In addition, the detection of AMI is an emergent matter and requires accurate and rapid detection using left ventricular LX and short-axis PM level views, which are relatively easy to obtain. These results indicate that this method identifies acute anteroseptal infarction with high accuracy and distinguishes it from normal myocardium. Therefore, this method can assist non-cardiologists and inexperienced clinicians in diagnosing acute anteroseptal infarction during initial treatment.

Limitations and future works
This study has a few limitations. First, all echocardiograms were performed at the same institution. Second, this classification method is performed offline; therefore, it needs to be adapted to real-time processing so that the classification can be used during echocardiography. Third, we did not evaluate each segment of the heart individually; rather, we examined only acute anteroseptal infarction with occlusion of #6 in the American Heart Association Committee Report, which occurs the most frequently. Since this study focused only on acute anteroseptal infarction, the classification and evaluation of infarcts in each segment and in other coronary artery dominant regions should be performed in the future.

Conclusion
In this study, we developed an automatic detection scheme for AMI on echocardiography images using CNN and LSTM. The classification results showed that our proposed method was able to classify AMI and normal cases with high accuracy, supporting its effectiveness as a supplemental tool for the detection of AMI on echocardiography. Specialists and skilled doctors can easily detect an anteroseptal infarction. However, detection may be difficult for residents and physicians who are unfamiliar with echocardiography at the time of the initial visit or for physicians in non-cardiology clinics. Anteroseptal infarctions occur frequently and require accurate detection and diagnosis, regardless of the technician's or physician's experience, field, or situation. This method can contribute to the detection of AMI and is expected to lead to appropriate treatment and a better prognosis for affected patients. Another technical novelty of this study is the use of LSTM, which enables a time-series analysis of wall motion on echocardiography. The results showed that LSTM can detect AMI more accurately than ANNs without a time-series analysis function, supporting the superiority of LSTM. In addition, we used left ventricular long-axis and short-axis views (papillary muscle level), which are minimal and easy to depict for diagnosis, as input. Since high classification accuracy was obtained with these views, this LSTM-based method may also be applicable to other views that are easy to acquire. In addition, although this study focused only on acute anteroseptal infarction, its methodology is expected to be extended to the detection of infarcts in other coronary artery dominant regions.